[SPARK-9973][SQL] Correct buffer size #8189
Closed
viper-kun wants to merge 1 commit into apache:master from viper-kun:errorSize
Conversation
Contributor (Author)
@liancheng @scwf Is it OK?
Contributor
Could you file a JIRA ticket and update the PR title to
Contributor
Jenkins, this is ok to test.

Test build #40973 has finished for PR 8189 at commit
Contributor
Thanks, I'm merging this to master. @rxin Is it OK to have this one in branch-1.5 at this time?
Contributor
@viper-kun Please add your name and email address to GitHub so that we can include that information while merging your PRs. I've added your information, gathered from JIRA, by hand this time.
Contributor
This is fine for 1.5.
asfgit pushed a commit that referenced this pull request on Aug 16, 2015:

The `initialSize` argument of `ColumnBuilder.initialize()` should be the number of rows rather than the number of bytes. However, `InMemoryColumnarTableScan` passes in a byte size, which makes Spark SQL allocate more memory than necessary when building in-memory columnar buffers.

Author: Kun Xu <viper_kun@163.com>
Closes #8189 from viper-kun/errorSize.
(cherry picked from commit 182f9b7)
Signed-off-by: Cheng Lian <lian@databricks.com>
CodingCat pushed a commit to CodingCat/spark that referenced this pull request on Aug 17, 2015 (same commit message as above; closes apache#8189 from viper-kun/errorSize).
When caching a table in memory in Spark SQL, we allocate too much memory.

In InMemoryColumnarTableScan:

val initialBufferSize = columnType.defaultSize * batchSize
ColumnBuilder(attribute.dataType, initialBufferSize, attribute.name, useCompression)

In BasicColumnBuilder:

buffer = ByteBuffer.allocate(4 + size * columnType.defaultSize)

Since `size` here is the `initialBufferSize` passed in above, the total allocated size is 4 + batchSize * columnType.defaultSize * columnType.defaultSize. It should be 4 + batchSize * columnType.defaultSize.
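A minimal sketch of the over-allocation arithmetic, using illustrative values (batchSize = 10000 rows and an 8-byte column type such as LongType; these numbers are assumptions, not from the PR):

```scala
object BufferSizeSketch {
  def main(args: Array[String]): Unit = {
    val batchSize = 10000 // rows per in-memory batch (illustrative)
    val defaultSize = 8   // bytes per value for the column type (illustrative)

    // Before the fix: InMemoryColumnarTableScan passed a byte count
    // (defaultSize * batchSize) where a row count was expected, so the
    // builder multiplied by defaultSize a second time.
    val initialBufferSize = defaultSize * batchSize
    val buggyAllocation = 4 + initialBufferSize * defaultSize

    // After the fix: pass the row count, so the builder allocates
    // 4 header bytes plus one defaultSize slot per row.
    val fixedAllocation = 4 + batchSize * defaultSize

    println(s"buggy: $buggyAllocation bytes") // 640004
    println(s"fixed: $fixedAllocation bytes") // 80004
  }
}
```

With these numbers the buggy path allocates defaultSize times more than needed (640004 bytes instead of 80004), which matches the commit message's observation that Spark SQL allocated more memory than necessary when building in-memory columnar buffers.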